Temporal and dimensional effects in evolutionary graph theory 
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The spread in time of a mutation through a population is studied analytically and computationally 
in fully-connected networks and on spatial lattices. The time, i, , for a favourable mutation to 
dominate scales with population size A'^ as A^(^+'^)/^ in D- dimensional hypercubic lattices and as 
A'^ In A'^ in fully-connected graphs. It is shown that the surface of the interface between mutants and 
non-mutants is crucial in predicting the dynamics of the system. Network topology has a significant 
effect on the equilibrium fitness of a simple population model incorporating multiple mutations and 
sexual reproduction. Includes supplementary information. 
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Population models have been used to study various 
problems in evolutionary biology [li l2, Isf. Generally, 
these models assume a non-spatial population [J, Is], lo, LZf . 
As competition in the wild is generally local and spatial 
[8| , the impact of this assumption needs to be examined 
carefully. There is evidence that spatially subdividing 
populations can change the dynamics of alleles spreading 
through a population [3, [lOj llll • Spatial structure has 
also been shown to change the dynamics of co-operative 
games[il,[il. 

Recently, Lieberman et al.[lj] introduced 'evolution- 
ary graph theory' to allow the effects of spatial structure 
on evolutionary dynamics to be studied directly, placing 
Moran's process[l5J on a network. On a 2Z3-network, 
this model is equivalent to the Williams-Bjerknes model 
[16]. Lieberman et al. showed that, in networks charac- 
terized by a symmetric stochastic matrix, mutants have a 
fixation probability that depends on the mutant fitness, 
r (defined relative to a background wild-type fitness of 
unity), and on the number of nodes in the system, N , 
but which is the same for all members of that class of 
networks. Other networks, such as scale-free networks. 



were found to lead to different fixation probabilities [14 1 . 

The main aim of this paper is to investigate analyti- 
cally and numerically the dynamics of evolution in net- 
works with symmetric stochastic matrices. Wc demon- 
strate that the mutant fixation time significantly depends 
on the network topology and dimensionality (as well as N 
and r) , despite the fact that the fixation probability in all 
these networks is the same in the limit of infinite time. 
In a related model, allowing for simultaneous multiple 
mutations, the equilibrium fitness is defined by a balance 
equation with a typical time much less than the fixation 
time and hence the equilibrium fitness of a population 
model depends on the Euclidean dimensionality of the 
network. Thus, approximating realistic spatial networks 
by fully-connected graphs results in an appreciable over- 
estimate of the equilibrium fitness. We also demonstrate 
that the topology of the network affects differently the 
equilibrium fitness of sexual and asexual populations. 



In 'evolutionary graph theory' [lj|, each vertex of a 
graph represents an individual with fitness r for a mutant 
and 1 for a wild- type organism. On each turn (time step), 
an individual on a node i is selected for reproduction, 
with a probability linearly proportional to its fitness. A 
clone of the selected individual is placed onto one of the 
nodes, j, connected to it by the network which is chosen 
with a probability, w^-, given by the stochastic matrix. 
The individual previously at j is replaced, conserving the 
population size. Fixation of the mutant gene occurs when 
the number of mutants, m, reaches N , and extinction 
occurs when to = 0. 

Here, we consider the special class of graphs with fixed 



node connectivity, Z, and Wi 



1/Z for connected 



nodes and Wij — otherwise, and analyse the spread of 
an advantageous mutation (r > 1). 

The number of mutants, m(t), increases, decreases 
or remains unchanged in each time step with probabili- 
ties, p|, pi, and 1 — Pi — Pi, respectively. Changes oc- 
cur at the interface consisting of A links between mu- 
tants and wild-types. The value of p-\ is given by the 
probability that a given mutant reproduces in the turn, 
Pm.rcpr — T / [mr + {N — to) X 1], multiplied by the num- 
ber of mutants and p™. inter — A/{mZ), which is the 
probability that the mutant passes its offspring across 
the interface, i.e. p^ — {A/Z)r/[mr + {N — m)]. Sim- 
ilarly, pi = (A/Z) X l/(TOr -I- (A^ - to)). Taking the 
deterministic (fluctuations are ignored), continuous-time 
and continuous-mutant population number approxima- 
tion (1 ^ TO, ^ N), we obtain the following equation 



dm 
'df 



Rt — R 



A 



1 



Z mr + (A^ — m) 



(1) 



where Rui) = P|(n/At (with At — 1 being the simu- 
lation time step length), i.e. the change in the mutant 
population per time step is, on average, equal to the prob- 
ability that the population grows minus the probability 
that it decreases. 

The number of links between mutants and non- 



TABLE I: Values for 7 found from fits to Eq. (O with 16 
realisations used for each fit. In the 2D case, fits were per- 
formed with between 60,000 and 150,000 mutants; in the 3D 
case, between 45,000 and 75,000 mutants were used. 



Mutant fitness, r 



Square lattice, 7 Cubic lattice, 7 



1.5 
2 
3 
100 



0.513 ±0.014 
0.504 ± 0.008 
0.512 ± 0.008 
0.509 ± 0.006 



0.684 ±0.012 
0.700 ±0.012 
0.692 ±0.012 
0.698 ± 0.007 



mutants may vary with m{t), and for some networks, the 
functional form of A(m) is known exactly. For example, 
in the linear chain {Z = 2, periodic boundaries), A = 2 
for m < N and A = for m — N, and hence the solution 
of Eq. ([T^ . with m(0) — 1, is straightforward. 



m{t) 



for t <tf^ N'^{ 




2t, 



(2) 



1), where the upper limit 



t{ is the fixation time which scales quadratically with N. 
In the case of a fully-connected graph {Z — N — 1), 
A = ra{N — to), and the rate equation can be solved 
implicitly, resulting in (for to,(0) = 1): 



t 



{N-l) 
(r-1) 



In 



[N - lym 



{N - my 



(3) 



Simulations support these analytical expressions (see 
Fig. [Ha) and (d)). The continuum approximation used 
in Eq. (fT2|) breaks down for to, ^ or m — > iV. This 
causes large fluctuations at the initial stage of invasion, 
which lead to displacements of m{t) along the time axis 
(see Fig. [1]), being equivalent to variations in the initial 
condition. The shape of the curve does not change and 
agrees well with the analytical predictions. It might be 
biologically important that the size of the initial fluctu- 
ations increases with decreasing mutant fitness [l8|, as 
can be clearly seen from Fig. [TJd). The fluctuations, 
which separate the curves of each simulation run are of 
greater consequence in the high-dimension systems where 
the overall time-scales are shorter. 

For square [Z = 4) and cubic {Z = 6) lattices, the rela- 
tionship between A and m is multi- valued and unknown. 
The current belief is that the mut ant /non- mutant inter- 
face is a self-affine surface with 



A^ Pm? 



(4) 



and ^ — [D — i)/D ^], which is consistent with our 
numerical data (see Table HI . 

Using the numerical estimates for /3 and 7 = (D — 
1)/D, the rate Eq. (fT^ can be solved implicitly for 2D- 
and 3Z)-lattices {m{0) = 1), 
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FIG. 1: (Color online) Number of mutants, m, vs. time, t, 
in the models defined on networks of different topologies: (a) 
linear chain, (b) square lattice, (c) cubic lattice and (d) fully- 
connected graph. Predictions (solid black line) are compared 
to data from six simulation runs (dashed red line). The plots 
marked '*' are for mutants of r = 3, '#' corresponds to r = 2, 
and '±' to r = 1.5. A'^ = 250, 000 has been used in simulations 
for all networks except (c) with N — 250, 047. The theoretical 
plot in (a) overlays the simulation data. 



Numerical simulations are in better agreement with an- 
alytical predictions in the case of 2D- than 3Z)-lattices. 
Both predictions fail when edge effects (above the in- 
flexion points in Figs. [1] (b, c) for simulations) become 
important, i.e. at sufficiently large times. The system- 
atic deviation, overestimating the mutant spread speed, 
of the simulated data from analytical predictions is due 
to the violation of Eq. ([13]) (see Table |T| for small times, 
when the number of mutants is small and one is far from 
the asymptotic regime. This effect is more significant in 
the cubic lattice, which tallies with work on Eden growth 
exhibiting rnore pronounced crossover effects in higher 
dimensions [20] . These deviations are also more substan- 
tial when mutants have a smaller comparative fitness (cf. 
curves for different values of r in Fig. [T] (b, c)). Real 
favourable mutations generally confer an advantage far 
smaller than 50% (corresponding to r = 1.5 in Fig. [T]) 
[l8|, and thus this discrepancy, and early fluctuations, 
are potentially of importance in biological systems. The 
less-biased errors from the approximations in Eq. (|12p 
are present in all the simulations, and, for a given r, are 
of similar magnitude in time - the spread of data in time 
is of a similar magnitude in the body of the plot for all 
the system types, but shows up most in fully-connected 
systems which exhibit the shortest time to fixation. 

It follows from Eqs.(2)-(5) that the typical time, i*, 
taken for a mutant to dominate the graph (i.e. to occupy 
a large fraction of sites, m{t^,) ^^ N) depends strongly 
on the network topology. In a fully-connected graph. 
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FIG. 2: (Color online) Relationship between scaling expo- 
nents for fixation time and population size: (a) Linear chain; 
(b) Square lattice; (c) Cubic lattice; (d) Fully-connected 
graph. For (a)-(c), theory predicts tf oc A^''. Using simu- 
lations for system sizes spaced at InA'^ = 0.5 intervals, least- 
square fits were performed over three adjacent times to obtain 
rj. Each point is therefore an approximation to the gradient 
of a In tf vs In TV plot at the point. The predicted scaling 
{rj = D/{D + 1)) is shown with dashed lines. For (d), the 
theory predicts that the time to fixation on a fully-connected 
graph divided by NlnN will be constant. Between 100 and 
3300 data points were averaged for each value of tf. 



t* scales as NlnN, and, in a Z?-dimensional Euclidian 
network, as A^(^+i)/-D Fig.[2]shows the approach to this 
predicted scaling. The exponents for linear chains (a) and 
fully-connected graphs (d) approach the predictions at 
very small sizes compared to 2D- and 3D-lattices, which 
is due to the finite-size effects inherent in using Eq. [T51 

The above analysis was for a very simple model, with 
only one mutation present at any time. However, in re- 
alistic biological systems, multiple mutations can occur 
and interact. To investigate these effects, we have mod- 
ified the model by allowing offspring not to be perfect 
clones of their parents. In such a model, the network 
topology has a significant impact on the evolved, equi- 
librium fitness of the population. We also look at the 
effects of sexual reproduction, with Mendelian recombi- 
nation rules, and demonstrate that network topology also 
affects sexual populations, but in a different way to asex- 
ual ones. 

In the asexual model with multiple mutations, each 
time that an offspring is produced, it is subject to a pos- 
sible mutation. Mutations occur with a probability, /i, 
and increase the fitness with probability p, or decrease it 
with probability 1 — p. The fitness increment/decrement 
is the same, Ao, i.e. offspring are produced from clones 
of their parents, but with a stochastic change in their fit- 
ness Toffspring = ?'paront + A. The valuc of A is drawn from 



a probability distribution function, p(A), of the form, 

p(A) = il-fi)S{A) + fi[il-p)S{A + Ao)+pSiA-Ao)] , 

(6) 
(with Ag > 0). In this model (as we have observed nu- 
merically), the population reaches an equilibrium fitness. 



caused by mutation-selection balance [2l|, proportional 
to Aq provided that the initial dynamics do not lead 
to extinction (in which all individuals' fitnesses reach 
zero) and that p < 0.5 (with more positive mutations 
than negative mutations, the fitness tends to infinity). 
In the fully-connected system, this is the model studied 
in Ref . |^ . Fig. [3] shows that the equilibrium fitness 
of asexual populations significantly depends on the di- 
mensionality of the network as well as on the popula- 
tion size (23|, with higher dimensional systems exhibit- 
ing greater equilibrium fitnesses. The networks found 
(Fig. [T|) to have a slower spread of mutants lead to a lower 
fitness in the model with multiple mutations. The slower 
spread means that individuals compete with closely re- 
lated neighbours and when a fit individual reproduces, it 
is more likely to replace another fit individual. It can be 
seen from Fig. [3] that the fitness increase with N satu- 
rates for the spatial systems. For fully-connected systems 
of up to 250,000 nodes, no saturation was observed (cf. 
Ref. ^). 
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FIG. 3: (Color online) Relationship of equilibrium fitness 
to population size, A'^, and network topology in an asexual 
population for ^ = 0.4, Ao = 1 and p = 0.1 (cf. Eq. fflT} ]. 
Data were averaged over 50 realizations and errors are smaller 
than the symbols. Lines are guides to the eye only. The 
linear-chain and square-lattice systems use the left and bot- 
tom axes; the cubic-lattice and fully-connected systems use 
the top and right axes. Initial condition: all fitnesses set to 
unity. The numbers measuring equilibrium fitness are not di- 
rect measures of biological reproduction rates, being scaled 
by Ao. 

To study a sexual population, it is necessary to intro- 
duce multiple genes per individual, allowing recombina- 



tion. We study an additive model (see e.g. 24|) in which 
each individual has G genes. Each gene is characterised 
by a number (quality factor), and the fitness of the indi- 
vidual is the sum of these quality factors. When an indi- 
vidual is chosen for reproduction, a partner is selected at 
random from the nodes connected to it and the offspring 
is produced by a Mendelian shuffling of the genes. The 
above criteria define hermaphrodite haploids (bisexual 
individuals with a single copy of each gene). Mutations 
are applied as in the asexual system, but now each gene 
is subject to a possible mutation. In Fig. 21 the equihb- 
rium fitness of the sexual population versus N is shown 
for different network topologies, demonstrating that the 
dimensionality of the network also affects the fitness of 
sexual populations, with higher dimensionality networks 
again leading to increased fitnesses. However, the effect is 
not as large as for the asexual populations. A saturation 
equilibrium fitness with A'^ was not observed in the 2Z3-, 
3-D-lattices or fully-connected systems for N < 250, 000 
nodes. The sexual populations were found to have higher 
equilibrium fitnesses than the asexual fitnesses for the pa- 
rameter range G > 10,Ai > 0.01, p < 0.1,7V > 200. The 
general principle that the advantage of sex is greater (and 
the fitness is lower) in lattices than in fully-connected 
graphs holds over a range of variables. 

It should be noted that in an asexual system created 
by suppressing the recombination in a sexual system (i.e. 
an asexual population with genes) the fitness depends 
only on the product ^G. The populations in Figs [3]|4] are 
therefore directly comparable, and the differences are due 
to the presence or absence of recombination. The 'slower' 
networks still see the fitness disadvantage, as with the 
asexual system. However, the correlations in the popula- 
tion fitness (a fit neighbour being surrounded by other fit 
individuals) now give a benefit as well as a disadvantage 
- a fit individual is more likely to mate with a fit indi- 
vidual as well as to replace one with her offspring. The 
sexual process may also reduce the degree of spatial frus- 
tration, as genes can pass to second-nearest neighbours, 
potentially speeding up the spread of mutations. A larger 
advantage for sex does not necessarily mean, however, 
that the establishment/maintenance of sex would occur 
more readily [25'| in a spatial network than in the fully 
connected systems usually studied (to be discussed else- 
where) . 

To conclude, we have demonstrated that the spatial 
nature of populations (dimensionality of the population 
network) is of significant importance for the dynamics 
of evolution. The following effects have been found for 
populations defined on networks of higher dimensionality 
(e.g. fully-connected graph as compared to a square lat- 
tice): (i) faster spread of a mutation, (ii) greater values 
of equilibrium fitness in both sexual and asexual popu- 
lations, in a model in which mutations do not have time 
to fixate before the next mutation is introduced into the 
system. Whilst both sexual and asexual populations have 
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FIG. 4: (Color online) Relationship of equilibrium fitness to 
population size, A'^, and network topology in a sexual popu- 
lation for ^ = 0.04, Ao = 1 and p = 0.1 (c.f. Eq. ^) and 
G = 10 (so that ^G = 0.4 has the same value as fj, in Fig. [3|. 
The data were averaged over 50 realizations and the errors are 
smaller than the symbols used. Lines are guides to the eye 
only. Initial condition: all gene quality factors set to unity. 



increased fitnesses for higher dimensional networks, the 
effect of topology on each is markedly different. 
CJP thanks the EPSRC for financial support. 
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SUPPLEMENTARY INFORMATION 

THE DISCRETE ORIGINS OF THE MASTER 
EQUATION 

Each individual on the graph has a fitness, 1 (wild- 
type) or r (mutant) in this part of the paper, associated 
with it. On each turn, an individual is selected for re- 
production, with a probability proportional to its fitness. 
A clone of the selected individual is placed onto one of 
the nodes connected to it by the network. The individual 
that previously occupied the node is replaced, conserving 
the population size, N. 

For the spread of mutants, the important events are 
those that occur at the interface between mutants and 
non- mutants. At a given connection between a mutant 
and a non-mutant node, the probability that an event 
occurs at one of the nodes in a turn is: 



1 



Pc 



mr + [N — m) 



(7) 



At a given interface link there must be a wild type (fitness 
of unity) and mutant individual (r). The probability of 
selecting one of these is normalised by the total mutant 
(ra X r) and wild type {[N — m] x 1) fitness. Given that 
an event has happened at a node linked to a particular 
connection between mutant and non- mutant, the proba- 
bility that the offspring is placed across this connection 
is l/Z where Z is the connectivity of each node. In to- 
tal, there are A links between mutants and non- mutants. 
The probability that, in a given turn, an event happens 
across the boundary is: 



A 



Pboundary 



l + r 



Z mr + {N — m) 



(8) 



Therefore, the probability that, in a given turn, the 
population of mutants increases by one is: 



A 



l + r 



PT 



(9) 



Z mr + [N — m) l + r 

The probability that the mutant population shrinks by 
one is 

A l + r 1 



Pi 



Z mr + [N — m) l + r 



(10) 



This can be written as a discrete master equation, 
where the probability that there are m mutants in the 
system at time t is given by: 

Pm,t =PtPm-l,t-l +PlP,n+l,t-l + (1 - PT ^Pl)Pm.t-l ■ 

(11) 

Alternatively, the continuum, deterministic approxi- 
mation can be taken, where the probability of increase 
minus the probability of decrease per time-step in mutant 
population leads to an average rate of increase (valid for 
1 << m « N): 



dm 
It 



R^ - Rl 



A 



r-l 



Z mr + {N — m) 



(12) 



where i?|(|) = P|(|)/Ai (with Ai = 1 being the simula- 
tion time step length). This is equation (1) in [26|. 



DERIVATION OF THE RELATION BETWEEN 

THE NUMBER OF MUTANTS AND TIME ON A 

LATTICE 

We use the approximation, 

A = /?toT , (13) 

and place this into equation (I12p before integrating. 

mr+{N-m),^ [' (3 . ^,,^ ,^^, 
m^ ^dm= / -(r-l)di. (14) 



This integral can be solved to give 
{r-iy— +N 



-(r-l)— iV^_ = ^(r-l)i, 

2-7 1-7 2-7 1-7 Z^ ' ' 

„(15) 



which can be re-arranged to give equation (5) in [2£ 



EXPLANATION OF THE EQUATION 

DESCRIBING THE INTRODUCTION OF NEW 

MUTATIONS 

In the second part of the paper, a model is introduced 
in which parents do not necessarily produce perfect off- 
spring. The offspring's fitness is related to that of the 
parent by the following formula: 



'"offspring 



r 



parent 



(16) 



The difference in the fitnesses. A, is taken from a prob- 
ability distribution, 

p(A) = (l-/^),5(A) + ^(l-p)(5(A + Ao) + A*P<^(A-Ao) . 

(17) 
This distribution consists of three weighted delta- 
functions, representing the three discrete possibilities. 



With probability y^(l — p) (the probabihty that a muta- the offspring will have a fitness of Aq more than that 

tion occurs multiplied by the probability that it is dele- of their parent, representing an advantageous mutation, 

terious), the offspring will have a fitness of Aq less than With probability 1 — /i, no mutation will have occurred 

that of their parent (A = — Aq). With probability /ip, and the offspring will have the same fitness as the parent. 



